
Fix #1995: Use StrEnum for enum-like strings#2016

Merged
mdboom merged 11 commits into NVIDIA:main from mdboom:str-enums
May 5, 2026

Conversation

@mdboom
Contributor

@mdboom mdboom commented May 4, 2026

See discussion in #1995.

@mdboom mdboom added this to the cuda.core v1.0.0 milestone May 4, 2026
@mdboom mdboom requested a review from rwgk May 4, 2026 18:58
@mdboom mdboom self-assigned this May 4, 2026
@mdboom mdboom added the cuda.core (Everything related to the cuda.core module) and P1 (Medium priority - Should do) labels May 4, 2026

@rwgk
Contributor

rwgk commented May 4, 2026

PR 2016 Agent Review

Reviewer: Opus 4.7 1M (Max), 2026-05-04

Posted with a few small manual edits. I read through the entire review but only checked a few of the many details.


Context

This PR is the second half of the cuda.core v1.0.0 enum hardening that PR #2014 started. Where 2014 wrapped NVML enums for cuda.core.system, this PR introduces eleven enum.StrEnum types for the compilation/linking pipeline and the memory APIs:

  • _program.pyx: SourceType, CodeType, CompilerBackend, PchStatus
  • _memory/_virtual_memory_resource.py: VirtualMemoryAllocationType, VirtualMemoryLocationType, VirtualMemoryHandleType, VirtualMemoryGranularityType, VirtualMemoryAccessType
  • _memory/_managed_memory_resource.pyx: ManagedMemoryLocationType
  • graph/_graph_definition.pyx: GraphMemoryType

It uses from enum import StrEnum with a backports.strenum fallback for Python < 3.11 (added to both pyproject.toml and pixi.toml). Argument-side methods accept either the StrEnum member or a plain string; return-side values are still mostly plain strings, matching the migration approach proposed in #1995.


Blocking Issues

These should be addressed in this PR.

1. MemoryLocationType is undefined (typo)

cuda_core/cuda/core/_memory/_managed_memory_resource.pyx:110

def preferred_location(self) -> tuple[MemoryLocationType, int | None] | None:

There is no symbol named MemoryLocationType anywhere in the workspace. The intended name is ManagedMemoryLocationType (introduced in this same PR). Because the file uses from __future__ import annotations, the typo is silent at import time, but typing.get_type_hints(...), IDE inference, and Sphinx autodoc all raise NameError when resolving the annotation.

Fix: rename to ManagedMemoryLocationType (already in scope at line 26).
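The failure mode can be reproduced in miniature (illustrative class names only; the string annotation stands in for the effect of from __future__ import annotations, under which every annotation is stored as a string and undefined names go unnoticed at import time):

```python
import typing

class ManagedMemoryResource:
    # "MemoryLocationType" is undefined; because the annotation is only a
    # string, defining and importing this class raises nothing.
    def preferred_location(self) -> "MemoryLocationType":
        ...

err = None
try:
    # Resolving the annotation (as IDEs and Sphinx autodoc do) fails:
    typing.get_type_hints(ManagedMemoryResource.preferred_location)
except NameError as exc:
    err = exc

print(err)  # name 'MemoryLocationType' is not defined
```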

2. Program.backend / Program.pch_status claim enum return types but return str

cuda_core/cuda/core/_program.pyx:178-180 and :151-175

The properties advertise -> CompilerBackend and -> PchStatus | None, but:

  • _program.pxd declares cdef str _backend and cdef str _pch_status.
  • _backend is set with self._backend = str(CompilerBackend.NVRTC) at lines 684, 694, 728.
  • _pch_status is set with str(status) at line 862, but with the bare enum (PchStatus.FAILED) at line 884.

Net result: program.backend is always a plain str; program.pch_status is sometimes a plain str and sometimes a PchStatus, depending on the code path. Compare with Linker.backend (_linker.pyx:171-174), which now returns a real CompilerBackend member — so two related public properties advertising the same return type behave differently.

Fix: pick one direction and apply it consistently across Program.backend, Program.pch_status, and Linker.backend. Recommended:

  • Change cdef str _backend / cdef str _pch_status in _program.pxd to cdef object.
  • Drop the str(...) wrappers in _program.pyx (lines 684, 694, 728, 862).
  • Keep the doc-stated enum return types.

This makes isinstance(p.backend, CompilerBackend) and isinstance(p.pch_status, PchStatus) actually true. All existing test assertions (program.backend == "NVRTC", etc.) continue to pass via StrEnum equality.

3. New public enums are documented under api_private.rst

cuda_core/docs/source/api_private.rst:24-34

All eleven new enums (CodeType, CompilerBackend, GraphMemoryType, ManagedMemoryLocationType, PchStatus, SourceType, VirtualMemory*Type) are listed in api_private.rst, which carries :orphan: and is described in its own header comment as "for private classes exposed to users". These enums are reachable as cuda.core.CodeType etc., live in cuda.core.__all__, and are named in user-facing docstrings — they are public API.

Fix: move them into api.rst, ideally as a new "Enums" subsection within each owning section:

  • CodeType, CompilerBackend, PchStatus, SourceType under "CUDA compilation toolchain".
  • GraphMemoryType under "CUDA graphs".
  • VirtualMemory*Type and ManagedMemoryLocationType under "Memory management".

Mirror the layout the system module already uses in api.rst:204-229 ("Enums" subsection under "CUDA system information…").


Recommended Polish (this PR or a quick follow-up)

These are smaller items the author may want to roll into the same PR while touching the files.

  • Stale comment, _program.pyx:943: # Keys here should match _utils.enum.CompilerBackend references a module path that does not exist (only _utils/enum_explanations_helpers.py is in the tree). Either move CompilerBackend (and likely CodeType/SourceType) into a shared module to also remove the lazy import in Linker.backend, or drop the comment.
  • Duplicated import shim: the try: from enum import StrEnum except ImportError: from backports.strenum import StrEnum block is copy-pasted in five files. Hoist to a single helper (e.g. cuda.core._utils._strenum) so the shim can be removed in one place when 3.10 support ends.
  • Dead __doc__ mutation, _program.pyx:69: SourceType.CXX.__doc__ = "C++ source code." is fragile (some Enum implementations make members read-only) and inconsistent — only one member of one enum gets a doc string. Drop it and document the "c++" value in the class docstring instead.
  • Tuple → list message change, _program.pyx:732: supported_code_types = [x.value for x in SourceType] changes the user-visible error message from (...) to [...]. The test was patched to a regex that accepts both (test_program.py:330), but it's cleaner to keep the tuple form: tuple(x.value for x in SourceType).
  • Stale assert, _program.pyx:733: assert code_type not in supported_code_types is trivially true at this point and reads as a leftover from earlier code. Remove.
  • Misleading tuple[ManagedMemoryLocationType, ...] annotation (post-fix to issue 1): once the typo is corrected, the property still returns plain strings (because _resolve_preferred_location stores plain "device" etc. into _pref_loc_type). Either weaken the annotation back to tuple[str, int | None] | None or promote _pref_loc_type to actually hold the enum.

Test Gaps

Worth adding while in this PR; cheap and they catch the regression that the blocking-issue fixes are guarding against.

  • assert isinstance(program.backend, CompilerBackend) somewhere in test_program.py. Currently every test only checks string equality, so a future regression that silently re-introduces str(...) would not be caught.
  • Same for program.pch_status returning PchStatus.
  • Parametrized coverage of SourceType.PTX and SourceType.NVVM in Program.__init__ (only SourceType.CXX is exercised in test_launch_invalid_values).
  • ProgramOptions.as_bytes(CompilerBackend.NVRTC) (the function does backend.lower(); works, but not currently tested with enum input).
  • For GraphAllocOptions.memory_type: parametrize _build_alloc_managed_node to also exercise enum input for DEVICE and HOST, plus string input for all three, so both halves of the "either is OK" promise are covered.

Suggested Follow-on PRs

The two items below are direct extensions of this work. They are scoped cleanly and would be easier to review separately than to fold in here.

Follow-on 1: ConditionalNode.cond_type enum

Background. Issue #1995 explicitly lists ConditionalNode.cond_type (returning "if", "while", "switch") as a string-vocabulary candidate for an enum, alongside everything PR 2016 already addresses. The current implementation lives at cuda_core/cuda/core/graph/_subclasses.pyx:701 and still returns plain str. It is not touched by PR 2016. Since #1995 is P0 on the v1.0.0 milestone (due 2026-05-07), closing this gap before 1.0.0 keeps the issue closable.

Suggested follow-on PR.

  • Add a ConditionalType(StrEnum) next to the other graph enums in cuda.core.graph._graph_definition (or a new _conditional.pyx if you prefer to keep _subclasses.pyx lean).
  • Members: IF = "if", WHILE = "while", SWITCH = "switch". ("if" and "while" are Python keywords, so they cannot be used directly as member names; the uppercase NAME = "value" pattern matches what SourceType.CXX = "c++" already does in this PR.)
  • Update _subclasses.pyx to return ConditionalType instead of str from ConditionalNode.cond_type.
  • Add it to cuda.core.graph.__all__ and re-export from cuda.core.
  • Add an isinstance test mirroring whatever pattern the team picks for Program.backend in this PR.

This follow-on is small (~30 lines + tests) and directly closes the remaining #1995 surface.

Follow-on 2: bindings-vs-core enum sync test

Background. PR 2014's review surfaced a structural risk that wrappers in cuda_core can drift from auto-generated cuda_bindings enums when NVML adds new values. The flip side applies to PR 2016: most of the new StrEnums here wrap driver-level C enums (CUmemAllocationType, CUmemAllocationHandleType, CUmemLocationType, CUmemAllocationGranularity_flags, CUmemAccess_flags, nvrtcResult, …) selectively, not exhaustively. When the CUDA driver introduces a new value (e.g. a future CU_MEM_HANDLE_TYPE_*), the wrapper StrEnum will silently fall behind and a user calling, say, VirtualMemoryResourceOptions(handle_type=...) with the new value would have to bypass the enum to use it — defeating the type-safety goal.

This is not a blocker for v1.0.0 (today's enums match today's bindings), but it is the kind of debt that compounds across CTK releases.

Suggested follow-on PR. Add a parametrized sync test (e.g. cuda_core/tests/test_enum_sync.py) that imports the relevant cuda.bindings.driver / cuda.bindings.nvrtc enums and asserts the wrapper StrEnum value-set is a known subset, with a small allow-list for intentional omissions:

@pytest.mark.parametrize("wrapper, source, omitted", [
    (VirtualMemoryAllocationType, cydriver.CUmemAllocationType,
     {"CU_MEM_ALLOCATION_TYPE_INVALID", "CU_MEM_ALLOCATION_TYPE_MAX"}),
    (VirtualMemoryHandleType, cydriver.CUmemAllocationHandleType,
     {"CU_MEM_HANDLE_TYPE_NONE", "CU_MEM_HANDLE_TYPE_MAX"}),
    (VirtualMemoryLocationType, cydriver.CUmemLocationType,
     {"CU_MEM_LOCATION_TYPE_INVALID", "CU_MEM_LOCATION_TYPE_NONE",
      "CU_MEM_LOCATION_TYPE_MAX"}),
    # ...etc.
])
def test_strenum_covers_bindings(wrapper, source, omitted):
    bindings_values = {m.name for m in source} - omitted
    wrapper_values = {m.name.upper() for m in wrapper}  # or a normalization fn
    missing = bindings_values - wrapper_values
    assert not missing, f"{wrapper.__name__} is missing: {missing}"

The test fails on the cuda-bindings update PR that adds a new driver value, prompting an explicit decision (extend the wrapper, or extend the omitted set with a comment). This is exactly the kind of thing that would have caught the gap quietly during the 12.x → 13.x bindings refresh.

A version of this test should also be applied to the NVML wrappers introduced in PR 2014, so the sync guard covers both halves of the v1.0.0 enum work.


Notes on Items NOT Addressed in This Review

  • MemoryType unification from cuda.core: Replace string literals with enums in public API #1995. Issue 1995 proposed a single shared MemoryType enum for device/host/managed. PR 2016 splits it into three (GraphMemoryType, ManagedMemoryLocationType, VirtualMemoryLocationType) — reasonable for documentation provenance, but worth a one-line note in the PR description so reviewers don't expect the unified form. (Could also be revisited in a future PR by deriving them from a shared base or making them aliases.)

@mdboom
Contributor Author

mdboom commented May 4, 2026

  1. New public enums are documented under api_private.rst
    cuda_core/docs/source/api_private.rst:24-34

All eleven new enums (CodeType, CompilerBackend, GraphMemoryType, ManagedMemoryLocationType, PchStatus, SourceType, VirtualMemory*Type) are listed in api_private.rst, which carries :orphan: and is described in its own header comment as "for private classes exposed to users". These enums are reachable as cuda.core.CodeType etc., live in cuda.core.__all__, and are named in user-facing docstrings — they are public API.

This is to address @leofang's concern offline about "class explosion" from using enums. By documenting them as linked from functions that use the enums, and not putting them at the top-level table of contents, I think this helps quite a bit with that concern. If anything the api_private.rst (which is just a hack to document things not in the table of contents) is now mis-named and maybe we address that separately.

Duplicated import shim: the try: from enum import StrEnum except ImportError: from backports.strenum import StrEnum block is copy-pasted in five files. Hoist to a single helper (e.g. cuda.core._utils._strenum) so the shim can be removed in one place when 3.10 support ends.

Let's separate that out. Not required for 1.0

#2019

Dead __doc__ mutation, _program.pyx:69: SourceType.CXX.__doc__ = "C++ source code." is fragile (some Enum implementations make members read-only) and inconsistent — only one member of one enum gets a doc string. Drop it and document the "c++" value in the class docstring instead.

That suggestion does not work with our current Sphinx setup, and it is also less ergonomic at the terminal or in Jupyter.

Sync tests

As with #2014, I'll experiment with what sync testing would look like, but it is not critical for 1.0

Contributor

@rwgk rwgk left a comment


Only two tiny fixes. The rest looks good to me.

Comment thread cuda_core/cuda/core/_program.pyx Outdated
Comment thread cuda_core/cuda/core/_memory/_managed_memory_resource.pyx Outdated
Comment thread cuda_core/cuda/core/graph/_conditional_type.py Outdated
Comment thread cuda_core/cuda/core/_module.pyx Outdated
Comment on lines +570 to +573
CodeTypeT = bytes | bytearray | str

-cdef tuple _supported_code_type = ("cubin", "ptx", "ltoir", "fatbin", "object", "library")
+cdef tuple _supported_code_type = tuple(CodeType.__members__.values())
Member


Ah, when reading this side-by-side it is easy to catch: We have both CodeTypeT and CodeType meaning two very different things, we probably should fix this

Contributor Author


Would CodeFormat (for the enum with "cubin", "ptx" etc.) make sense?

Member


Maybe ObjectCodeFormatT? The ObjectCode class can be constructed from any supported format.

Another naming convention discussion: Our type annotations have been made a bit C++-ish (but consistently) by appending a suffix T: IsStreamT, CodeTypeT, .... Maybe a good time to decide a convention and stick to it. My only concern (as raised elsewhere) is we might have both actual classes and associated type annotations refer to overlapping or orthogonal concepts. Would be nice to avoid ambiguity.

Contributor Author


I think in this case, enums are not type annotations (they have members and runtime semantics and such), so they shouldn't have the T. I know we are putting them in typing, though.

Comment thread cuda_core/cuda/core/_program.pyx Outdated
Comment thread cuda_core/cuda/core/__init__.py Outdated
Comment thread cuda_core/cuda/core/__init__.py Outdated
@leofang leofang added the enhancement (Any code-related improvements) label May 5, 2026
@mdboom mdboom requested a review from leofang May 5, 2026 14:45
@mdboom
Contributor Author

mdboom commented May 5, 2026

This got a pretty significant rework based on @leofang's feedback. All the new enums are defined in cuda.core.typing now. The suggested renames of the enums were applied. And this ties into the testing added in #2014 that ensures that new enum values added upstream will be detected. A few bugfixes to that new testing infrastructure were required to make this work, which then revealed a couple of small bugs in #2014, fixed here.

Comment thread cuda_core/tests/graph/test_graph_definition.py
Comment thread cuda_core/tests/test_launcher.py
Comment thread cuda_core/cuda/core/typing.py Outdated
Comment thread cuda_core/cuda/core/typing.py Outdated
Comment thread cuda_core/cuda/core/typing.py Outdated
Comment thread cuda_core/cuda/core/typing.py Outdated
Comment thread cuda_core/cuda/core/typing.py Outdated
Comment thread cuda_core/cuda/core/typing.py
Member

@leofang leofang left a comment


LGTM overall! Left a bunch of minor comments mainly for consistency.

@leofang leofang added the P0 (High priority - Must do!) and breaking (Breaking changes are introduced) labels and removed the P1 (Medium priority - Should do) label May 5, 2026
@mdboom
Contributor Author

mdboom commented May 5, 2026

LGTM overall! Left a bunch of minor comments mainly for consistency.

Ok. I renamed everything so all of these type-annotation-like things have a consistent Type suffix.

@mdboom mdboom requested a review from leofang May 5, 2026 15:54
Member

@leofang leofang left a comment


We will need a release note entry, feel free to add it in this PR or the next one. Approving.

@mdboom mdboom enabled auto-merge (squash) May 5, 2026 16:24
@mdboom mdboom merged commit cec0efb into NVIDIA:main May 5, 2026
94 checks passed
@github-actions

github-actions Bot commented May 5, 2026

Doc Preview CI
Preview removed because the pull request was closed or merged.

